Non-relevance Feedback Document Retrieval using Large Data Set
Abstract
In interactive document retrieval, we need to find documents relevant to a user’s interest in a large document collection within a few iterations of judgement on retrieved documents. In each iteration, a comparatively small batch of documents is evaluated to establish their relevance to the user’s interest. This method is also called relevance feedback, and it requires both relevant and non-relevant documents. However, the documents initially presented for the user’s judgement do not always include relevant documents. We have therefore proposed a feedback method that uses information on non-relevant documents only, which we call “non-relevance feedback”. Non-relevance feedback selects documents that are classified as lying outside the non-relevant region and that are near the discriminant hyperplane learned by a One-class Support Vector Machine (One-class SVM). We conducted experiments on large data sets containing over 500,000 newspaper articles and confirmed that the proposed method outperformed other methods.
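The selection rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a One-class SVM has already been trained on the documents judged non-relevant, and that each unjudged document has a signed score that is positive inside the learned non-relevant region and negative outside it (the sign convention used, for example, by scikit-learn's `OneClassSVM.decision_function`). The function name `select_candidates` and the parameter `batch_size` are hypothetical.

```python
from typing import List, Sequence


def select_candidates(scores: Sequence[float], batch_size: int) -> List[int]:
    """Pick the documents to show the user in the next iteration.

    Assumption: scores[i] is the signed distance of document i from the
    One-class SVM hyperplane; positive means "inside the non-relevant
    region", negative means "outside it".
    """
    # Keep only documents that fall outside the non-relevant region.
    outside = [(i, s) for i, s in enumerate(scores) if s < 0.0]
    # Order them by closeness to the hyperplane (smallest |score| first).
    outside.sort(key=lambda pair: -pair[1])
    # Return the indices of the closest batch_size documents.
    return [i for i, _ in outside[:batch_size]]


if __name__ == "__main__":
    # Hypothetical scores for six unjudged documents.
    example_scores = [0.8, -0.05, -1.2, -0.3, 0.1, -0.01]
    print(select_candidates(example_scores, batch_size=2))
```

In an interactive loop, the user would judge the returned documents; any that are again non-relevant would be added to the training set before the One-class SVM is retrained.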
Similar resources
Document Image Retrieval Based on Keyword Spotting Using Relevance Feedback
Keyword spotting is a well-known method in document image retrieval, in which the search over document images is based on a query word image. This paper proposes an approach to document image retrieval based on keyword spotting, presented as a framework that uses relevance feedback. Relevance feedback, an interactive and efficient method, is used in this paper to imp...
An One Class Classification Approach to Non-relevance Feedback Document Retrieval
This paper reports a new document retrieval method using non-relevant documents. From a large data set of documents, we need to find documents that relate to the user's interest in as few iterations of human testing or checking as possible. In each iteration, a comparatively small batch of documents is evaluated for relevance to the user's interest. Relevance feedback needs a set of relevant ...
Relevance Feedback Document Retrieval using Non-Relevant Documents
This paper reports a new document retrieval method using non-relevant documents. From a large data set of documents, we need to find documents that relate to the user's interest in as few iterations of human testing or checking as possible. In each iteration, a comparatively small batch of documents is evaluated for relevance to the user's interest. This method is called relevance feedback. The r...
CLEF-2005 CL-SR at Maryland: Document and Query Expansion using Side Collections and Thesauri
This paper reports results for the University of Maryland’s participation in CLEF-2005 Cross-Language Speech Retrieval track. Techniques that were tried include: (1) document expansion with manually created metadata (thesaurus keywords and segment summaries) from a large side collection, (2) query refinement with pseudo-relevance feedback, (3) keyword expansion with thesaurus synonyms, and (4) ...
UMass Genomics 2006: Query-Biased Pseudo Relevance Feedback
Query-biased pseudo relevance feedback creates document representations for document feedback that aim to be more relevant to the user than the entire document. Our submitted runs using query-biased feedback degraded performance compared to not using feedback. The cause of this degradation was the use of too many documents for feedback. Preliminary document retrieval experiments using fewe...